-
- J D I C
-
- Simple English Japanese Dictionary Display
- ==========================================
-
- Version 1.4 (August 1991)
-
- Introduction
- ------------
-
- This program provides a simple English/Japanese (kana & kanji) display of
- selected entries of a dictionary file. While it will work (more or less) with
- any text file containing a mix of Japanese and English words, it has been
- designed specifically to operate on a dictionary in the "EDICT" format used by
- the MOKE (Mark's Own Kanji Editor) Japanese text editor.
-
- The executable code and documentation of JDIC is hereby released to the "public
- domain". All usage of this program is at the user's risk, and there is no
- warranty on its performance.
-
- All the Japanese displayed is in kana and kanji, so if you cannot read at least
- hiragana and katakana, this is not the program for you.
-
- Installation
- ------------
-
- This program is distributed as a "zoo" archive (jdic14.zoo) containing the
- following files:
-
- jdic.exe (the executable)
- jdic14.doc (this documentation file)
- edictj (a Japanese/English dictionary file)
- *.bgi (Borland Graphics drivers for various cards)
-
- The files will need to be unpacked and copied into a directory on your hard
- disk. If you are storing them in the same directory as your MOKE files (e.g.
- \kanji) be careful not to overwrite MOKE's "edict" file. In addition, the
- 16-bit JIS font files "k16jis1.fnt" and "k16jis2.fnt" must be in this
- directory. These latter files are not included in this distribution. If you
- use MOKE, you have them already. If not, you will need to track them down at
- one of the FTP sites. In the MOKE1.1 distribution they are in the MK11K16.ZIP
- archive file. Another source is kd100.arc, the KD distribution file.
-
- The executable (jdic.exe) will need to be stored in a directory on your path if
- you wish to invoke JDIC from any directory. The simplest approach is to add
- \kanji to your path.
-
- The following environment variables may be set (note that they are the same
- environment variables used by MOKE.)
-
- bgi (the directory containing the bgi files. E.g. c:\tc or c:\kanji. If
- this is not present, the bgi files must be in the directory in which
- JDIC is invoked.)
-
- mokerc (the directory containing the moke.rc file. E.g. c:\kanji. If this
- is not present, the current directory will be searched for a file
- called moke.rc, and the directory details extracted.)
-
- jgraphic (set this to ATT400 if you have an AT&T high-resolution card.
- Otherwise it will default to CGA. NB: MOKE does not use this
- variable.)
-
- If you wish to operate JDIC from other directories, you must have a file
- "moke.rc" containing the following line:
-
- kanjipath directory-path (e.g. C:\KANJI)
-
- to tell the program the location of the control and font files. The
- environment variable "mokerc" must be used to specify the directory containing
- "moke.rc". (If you use MOKE, you will have a moke.rc file already.)
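
As an illustration of the "kanjipath" line described above, here is a minimal sketch in C of how one line of moke.rc might be recognised. The function name parse_kanjipath and the buffer sizes are assumptions made for this example; it is not JDIC's own code, and it assumes the keyword is matched case-sensitively and that the path contains no spaces.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` has the form "kanjipath <directory>",
   copy the directory into `out` and return 1; otherwise return 0.
   Assumes the directory path contains no spaces. */
int parse_kanjipath(const char *line, char *out, size_t outsize)
{
    char key[32];
    char value[200];

    if (outsize == 0)
        return 0;
    /* Split the line into a keyword and one whitespace-delimited value. */
    if (sscanf(line, "%31s %199s", key, value) != 2)
        return 0;
    if (strcmp(key, "kanjipath") != 0)
        return 0;
    strncpy(out, value, outsize - 1);
    out[outsize - 1] = '\0';
    return 1;
}
```

A caller would read moke.rc line by line (for example with fgets) from the directory named by the "mokerc" environment variable, and stop at the first line for which this function returns 1.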
-
- Operation
- ---------
-
- JDIC requires a PC or AT with a graphics card. It has been written
- using Turbo C 2.0, and has been tested on VGA, CGA, ATT and HERC cards.
- Auto-detection is used to determine the type of graphics card.
-
- The invocation of JDIC is:
-
- jdic [uses a dictionary called "edict"]
-
- or
-
- jdic dicname [where dicname is the name of your dictionary]
-
- The default dictionary "edict" is, of course, the name of MOKE's
- English/Japanese dictionary file. It will be located in the directory
- specified in your moke.rc file. If you use an alternative dictionary, it can
- be in any directory.
-
- JDIC also needs an index file "<dicname>.jdx". If it is not present, it will
- be created. JDIC saves the length of the dictionary file and the JDIC version
- in the .jdx file, and if it detects that either has changed, it will insist on
- recreating the index file, since a stale index would make look-ups useless.
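
The staleness check described above can be sketched as follows. The layout of the .jdx file is not documented here, so the struct jdx_header, its field names and the integer version encoding are all assumptions for illustration only.

```c
/* Hypothetical header: the real .jdx layout is not described in this
   document, only that it records these two values. */
struct jdx_header {
    long dict_length;   /* length of the dictionary file when it was indexed */
    int  jdic_version;  /* JDIC version that built the index (assumed: 14 = 1.4) */
};

/* Return 1 if the index must be rebuilt, 0 if it can be reused. */
int index_is_stale(const struct jdx_header *hdr,
                   long current_length, int current_version)
{
    return hdr->dict_length != current_length ||
           hdr->jdic_version != current_version;
}
```

Comparing only the length is cheap, but it cannot detect an edit that leaves the file size unchanged; deleting the .jdx file by hand forces a rebuild in that case.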
-
- Operation is very simple. After loading the dictionary, index and font files,
- the full-screen working window is displayed with the "Enter Search String:"
- prompt. Type a few letters from the *start* of the word(s) you are seeking.
- JDIC does not match on strings in the middle of words. The scan is
- case-insensitive.
-
- A multi-line display is produced for all the matches against the string. The
- display format is:
-
- matched_word [kanji] (yomikata) english_1, english_2, etc
-
- where "matched_word" is either ascii or kana, depending on the search string.
- If the search string was kana (i.e. the match was on the yomikata), the
- separate yomikata display is omitted.
-
- A line is only displayed once per search, regardless of the number of hits.
-
- After a search, a further prompt occurs at the bottom of the screen giving you
- the option of quitting (Q), requesting another search (A) or, if there is still
- more information to display, requesting the next screen-full (M).
-
- You will notice an "(A)" in the bottom lefthand corner of the screen. This is
- to indicate you are entering search strings in ascii (i.e. in English). If you
- press F3 before entering a string, you toggle between (A)scii, (H)iragana and
- (K)atakana. (Why F3? Well, that is the key that MOKE uses for this function.)
-
- To enter a search string in kana, type it in romaji and it will be converted to
- kana as you type. The romaji->kana translation is almost identical to that
- used in MOKE, i.e. for a small "tsu" you can type either a double consonant,
- e.g. "shippai", or "t-", e.g. shit-pai, and for "n" you can type "n'" if
- necessary (e.g. as in "kon'yaku"). Most of the time just typing ordinary
- Hepburn or kunrei romaji works. Note that the romaji must follow the kana
- style for long vowels. Tokyo must be toukyou, NOT tookyoo.
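
To make the rules above concrete, here is a small table-driven sketch in C that converts a whole romaji string to hiragana in EUC bytes. It is not the MOKE-derived code used by JDIC: the table is a tiny hand-made excerpt, the name romaji_to_kana is hypothetical, and it converts a complete string rather than converting as you type. It does show the three cases mentioned: a double consonant, "t-", and "n'".

```c
#include <string.h>

struct kana_rule { const char *romaji; const char *euc; };

/* Tiny illustrative table (hiragana as EUC-JP byte pairs).
   Longer romaji come first so that the first match is the longest. */
static const struct kana_rule rules[] = {
    { "shi", "\xA4\xB7" },
    { "ka",  "\xA4\xAB" }, { "ko", "\xA4\xB3" }, { "ku", "\xA4\xAF" },
    { "pa",  "\xA4\xD1" }, { "ya", "\xA4\xE4" },
    { "n'",  "\xA4\xF3" },
    { "a",   "\xA4\xA2" }, { "i",  "\xA4\xA4" }, { "u",  "\xA4\xA6" },
    { "n",   "\xA4\xF3" },
};
#define SMALL_TSU "\xA4\xC3"

/* Convert romaji to EUC hiragana. Returns 1 on success, 0 if a syllable
   is not in the table or `out` is too small. */
int romaji_to_kana(const char *in, char *out, size_t outsize)
{
    size_t used = 0, i;
    size_t nrules = sizeof rules / sizeof rules[0];

    while (*in != '\0') {
        const char *emit = NULL;
        size_t eat = 0;

        if (in[0] == 't' && in[1] == '-') {
            emit = SMALL_TSU;            /* "t-" gives a small tsu */
            eat = 2;
        } else if (in[0] == in[1] &&
                   strchr("kstphgzdbcfjmrw", in[0]) != NULL) {
            emit = SMALL_TSU;            /* double consonant: small tsu, */
            eat = 1;                     /* second consonant stays in input */
        } else {
            for (i = 0; i < nrules; i++) {
                size_t len = strlen(rules[i].romaji);
                if (strncmp(in, rules[i].romaji, len) == 0) {
                    emit = rules[i].euc;
                    eat = len;
                    break;
                }
            }
        }
        if (emit == NULL || used + 2 >= outsize)
            return 0;
        memcpy(out + used, emit, 2);     /* every kana here is two bytes */
        used += 2;
        in += eat;
    }
    out[used] = '\0';
    return 1;
}
```

With this table, "shippai" and "shit-pai" both yield the same four kana, and "kon'yaku" yields ko, n, ya, ku. A full table would also need entries such as "nya", which is exactly where the apostrophe becomes necessary.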
-
- The matching of kana strings is insensitive to whether they are katakana or
- hiragana. The ONE difference between them is that typing a "-" in hiragana
- gets a "u", and in katakana gets a "-", just as in MOKE.
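
One way to get matching that ignores the hiragana/katakana distinction is to fold both to the same bytes before comparing. The sketch below is an assumption about how this could be done, not a description of JDIC's internals. It relies on the EUC layout, in which hiragana have lead byte 0xA4 and katakana have lead byte 0xA5 with the same second byte, and it assumes the string holds only ASCII and two-byte EUC characters.

```c
#include <stddef.h>

/* Fold EUC-JP katakana onto hiragana in place.
   Second bytes 0xA1..0xF3 have hiragana counterparts; 0xF4..0xF6
   (vu, small ka, small ke) exist only in katakana and are left alone.
   Assumes only ASCII and two-byte EUC characters (no 0x8E/0x8F shifts). */
void fold_kana(unsigned char *s)
{
    size_t i = 0;

    while (s[i] != '\0') {
        if (s[i] >= 0xA1 && s[i + 1] != '\0') {
            /* Two-byte EUC character: inspect it, then skip both bytes. */
            if (s[i] == 0xA5 && s[i + 1] >= 0xA1 && s[i + 1] <= 0xF3)
                s[i] = 0xA4;
            i += 2;
        } else {
            i += 1;                      /* ASCII byte */
        }
    }
}
```

Folding both the search key and the index keys this way means an ordinary byte comparison then treats the two kana sets as equal.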
-
- The display is in "dictionary" order for the words matched, i.e. alphabetical
- for the ascii search, and EUC order for the kana search. EUC order is very
- close to the "gojuu" kana order in Japanese dictionaries except that it
- separates the syllables with nigori and maru.
-
- There is also an "Unlimited Display Mode" which is invoked by pressing F1
- before or during the entering of the search string. In this mode you will just
- keep scrolling through the dictionary instead of stopping when you run out of
- matching strings. Also in this mode entries are displayed every time there is
- a match in the index table (normally an entry is displayed once only.) This
- mode is useful for doing maintenance on the dictionary, and for just browsing.
- If you use this mode you may get some strange displays for entries which begin
- with kana and continue with kanji, e.g. ocha.
-
- Dictionary
- ----------
-
- Clearly, to be of any use, JDIC must have a reasonably good dictionary. Included
- with this distribution is the EDICTJ dictionary, which is the author's
- extension of MOKE's EDICT. MOKE's EDICT was about 1800 entries, and was
- compiled by Mark Edwards with help from Spencer Green. EDICTJ is about 4600
- entries. (You can rename it `EDICT' and use it with MOKE; it has been called
- EDICTJ to reduce the chance of accidental clobbering.)
-
- EDICTJ and edictj.doc are available separately at monu6 and some other
- archives.
-
- The dictionary file must use the "EUC" coding for Japanese characters. MOKE's
- EDICT does this, so that was the coding adopted in JDIC. Files using JIS
- codings can be converted to EUC using MOKE itself, or Ken Lunde's "JIS.C"
- program.
-
- The format of each entry in EDICT is:
-
- Japanese [yomikata] /english_1/english_2/..../
-
- If the word is in kana alone, the yomikata is omitted.
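
The entry format above can be split with ordinary C string functions, because in EUC both bytes of a Japanese character have the high bit set and so can never be mistaken for "[", "]" or "/". The following is a minimal sketch; the struct edict_entry, the function parse_edict_line and the limit of 16 translations are assumptions made for the example.

```c
#include <string.h>

#define MAX_GLOSSES 16

struct edict_entry {
    char *word;                 /* Japanese head word (EUC bytes) */
    char *reading;              /* yomikata, or NULL when omitted */
    char *gloss[MAX_GLOSSES];   /* English translations */
    int   ngloss;
};

/* Split one dictionary line in place. The entry's pointers refer into
   `line`. Returns 1 on success, 0 if the line is malformed. */
int parse_edict_line(char *line, struct edict_entry *e)
{
    char *slash, *open, *close, *p, *end;

    e->word = line;
    e->reading = NULL;
    e->ngloss = 0;

    slash = strchr(line, '/');           /* start of the English part */
    if (slash == NULL)
        return 0;
    *slash = '\0';

    open = strchr(line, '[');            /* optional [yomikata] */
    if (open != NULL) {
        close = strchr(open, ']');
        if (close == NULL)
            return 0;
        *close = '\0';
        e->reading = open + 1;
        *open = '\0';
    }

    end = line + strlen(line);           /* trim spaces after the head word */
    while (end > line && end[-1] == ' ')
        *--end = '\0';
    if (*line == '\0')
        return 0;

    p = slash + 1;                       /* collect /english_1/english_2/ */
    while (e->ngloss < MAX_GLOSSES) {
        char *next = strchr(p, '/');
        if (next == NULL)
            break;
        *next = '\0';
        if (*p != '\0')
            e->gloss[e->ngloss++] = p;
        p = next + 1;
    }
    return e->ngloss > 0;
}
```

Because the line is split in place, no memory is allocated; the caller must keep the line buffer alive for as long as the entry is used.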
-
- Technical
- ---------
-
- JDIC holds the complete dictionary in RAM, along with the first 3490 bitmaps of
- the JIS character set and the index table. The index table contains an entry
- for each word in the dictionary, sorted in alpha/kana order. This enables a
- fast search to be done, and for the display to be in alphabetical order by
- keyword. Common words like: "of", "to", "the", etc. and grammatical terms
- like: "adj", "vi", "vt", etc. are not indexed.
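
A sorted index makes a start-of-word search a matter of binary search. The sketch below finds the first index entry that begins with the search string, ignoring case; displaying all matches then means walking forward until the prefix no longer matches. The function names and the use of an array of string pointers are assumptions for illustration, as the actual index structure is not described here.

```c
#include <ctype.h>
#include <stddef.h>

/* Compare `key` with the first strlen(key) characters of `word`,
   ignoring ASCII case. Returns 0 when `word` starts with `key`. */
static int prefix_cmp(const char *key, const char *word)
{
    while (*key != '\0') {
        int a = tolower((unsigned char)*key);
        int b = tolower((unsigned char)*word);
        if (a != b)
            return a - b;    /* also covers `word` ending early (b == 0) */
        key++;
        word++;
    }
    return 0;
}

/* Return the position of the first word that starts with `key`, or `n`
   if there is none. `words` must be sorted ignoring case. */
size_t first_match(const char *const *words, size_t n, const char *key)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (prefix_cmp(key, words[mid]) > 0)
            lo = mid + 1;    /* words[mid] sorts before the key */
        else
            hi = mid;        /* words[mid] matches or sorts after it */
    }
    if (lo < n && prefix_cmp(key, words[lo]) == 0)
        return lo;
    return n;
}
```

Each search costs a number of comparisons proportional to the logarithm of the index size, which is why look-ups stay fast even though the index holds an entry for every word.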
-
- If a kanji is required that is not in the ~3000 most common ones, it is read
- from disk into a circular cache buffer. This happens rarely.
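
A circular cache of this kind can be sketched as a fixed array of slots that is overwritten in rotation. Everything specific below is an assumption: the slot count, the 32-byte size (a 16x16 bitmap at one bit per pixel), the names, and the fake_load function, which merely stands in for a read from the font file.

```c
#include <string.h>

#define CACHE_SLOTS 8
#define GLYPH_BYTES 32   /* assumed: 16x16 bitmap, 1 bit per pixel */

struct glyph_cache {
    unsigned      code[CACHE_SLOTS];                /* 0 = empty slot */
    unsigned char bitmap[CACHE_SLOTS][GLYPH_BYTES];
    int           next;                             /* slot to overwrite next */
};

/* Callback that fetches one glyph bitmap, e.g. from the font file. */
typedef void (*glyph_loader)(unsigned code, unsigned char *dest);

/* Stand-in loader for demonstration: counts calls instead of reading disk. */
static int disk_reads = 0;
static void fake_load(unsigned code, unsigned char *dest)
{
    memset(dest, (int)(code & 0xFF), GLYPH_BYTES);
    disk_reads++;
}

/* Return the bitmap for `code`, loading it into the oldest slot on a miss. */
const unsigned char *cache_get(struct glyph_cache *c, unsigned code,
                               glyph_loader load)
{
    int i;

    for (i = 0; i < CACHE_SLOTS; i++)
        if (c->code[i] == code)
            return c->bitmap[i];          /* hit: no disk access */

    i = c->next;                          /* miss: reuse slots in rotation */
    load(code, c->bitmap[i]);
    c->code[i] = code;
    c->next = (i + 1) % CACHE_SLOTS;
    return c->bitmap[i];
}
```

Rotating through the slots needs no usage bookkeeping, which suits a cache that is consulted only for the occasional uncommon kanji.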
-
- JDIC can cope with dictionaries up to about 200 kbytes. If a larger dictionary
- ever becomes available, another version could operate with the dictionary on
- disk. The parsing and sort to set up the index table would be slower, but the
- searching would still be quite fast.
-
- With release 1.4 the .jdx file can be larger than 64k bytes. To handle this,
- JDIC has to use far pointer arithmetic, so the parsing and sorting are a lot
- slower than in earlier versions. Also all kana and kanji strings are now
- indexed as the author is working on another program which uses the
- edict/edict.jdx files.
-
- Changes in Version 1.1
- ----------------------
-
- o ATT graphics card handling.
- o fixes to the parsing of kanji/kana strings. The result is that the .jdx file
- is about 20% larger than in V1.0.
-
- Changes in Version 1.2
- ----------------------
-
- o fixes to the romaji->kana code to handle "nyu" properly.
- o facility to use dictionaries other than "edict".
- o Unlimited Display Mode.
-
- Changes in Version 1.3
- ----------------------
-
- o immediate romaji->kana conversion (suggested by David Cowhig).
- o examination of the "bgi" and "mokerc" environment variables, and the
- "moke.rc" control file.
-
- Changes in Version 1.4
- ----------------------
-
- o reformatting the output to start with the `hit' word, and to put parentheses
- around the kanji and kana, and to position the ascii text better with respect
- to the kana/kanji.
- o handling of an index table greater than 64k bytes.
-
- It Doesn't Work!
- ----------------
-
- Oh dear. If you do not get the introductory message, you probably have a
- corrupted .exe. Try to get a clean copy. Also your environment might have
- trouble with the output of a Turbo C 2.0 compilation/link.
-
- If you actually get started, but cannot find anything, even when you put "a"
- as a search key, delete your .jdx file and start again. If it still doesn't
- work, mail the author a sample of your dictionary.
-
- Acknowledgements
- ----------------
-
- A message from the author:
-
- I wrote this program to gain experience in handling and displaying the Japanese
- character set, and to exploit the dictionary that came with my copy of MOKE. I
- also wanted to brush up my C skills. I make no claims for it, but I am pleased
- with how it turned out. I will consider releasing the source (if anyone is
- actually interested in it) at a later date.
-
- I welcome suggestions, comments and constructive criticism.
-
- I wrote about two-thirds of this program. Great lumps of it were lifted with
- minor modifications from "KD" (Kanji Driver), which was written by Izumi Ohzawa
- at Berkeley, in particular the JIS handling module (kjis.c) which was a port of
- "jis.pas" by Seiichi Nomura and Seke Wei.
-
- Ken Lunde's "japan.inf" and his elegant "jis.c" explained the workings of EUC
- and old/new JIS codes.
-
- Mark Edwards' MOKE remains the tour de force in this field, and an inspiration
- for us all. I regard JDIC as a humble and minor accessory to MOKE. (I use
- tables lifted from two of the ".hlp" files in MOKE to drive the romaji->kana
- code.)
-
- Jim Breen
- Department of Robotics & Digital Technology
- Monash University
- Melbourne, Australia
- (jwb@monu6.cc.monash.edu.au)
-
- May-August 1991
-